Search results for "Pattern discovery"

showing 10 items of 19 documents

Discovering discriminative graph patterns from gene expression data

2016

We consider the problem of mining gene expression data in order to single out interesting features characterizing healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attempt to build a different graph for each sample and, then, to have a database of graphs for representing a sample set. Our main goal is that of singling out interesting differences between healthy and unhealthy samples, through the extraction of "discriminative patterns" among graphs belonging to the two different sample sets. Differently from the other…

0301 basic medicine; Settore INF/01 - Informatica; business.industry; Computer science; 0206 medical engineering; pattern discovery; subgraph extraction; biological networks; Pattern recognition; 02 engineering and technology; Graph; 03 medical and health sciences; ComputingMethodologies_PATTERNRECOGNITION; 030104 developmental biology; Discriminative model; Graph patterns; Artificial intelligence; business; 020602 bioinformatics; Biological network; Network model; Proceedings of the 31st Annual ACM Symposium on Applied Computing

researchProduct

Sequential Mining Classification

2017

Sequential pattern mining is a data mining technique that aims to extract and analyze frequent subsequences from sequences of events or items under time constraints. Sequence data mining was introduced in 1995 with the well-known Apriori algorithm. The algorithm studied transactions over time in order to extract frequent patterns from the sequences of products related to a customer. Later, this technique became useful in many applications: DNA research, medical diagnosis and prevention, telecommunications, etc. GSP, SPAM, SPADE, PrefixSpan and other advanced algorithms followed. In view of the evolution of data mining techniques based on sequential data, this paper discusses the multiple …

Apriori algorithm; Computer science; business.industry; Data stream mining; Concept mining; 02 engineering and technology; computer.software_genre; Machine learning; GSP Algorithm; Tree (data structure); Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Data mining; Artificial intelligence; business; K-optimal pattern discovery; computer; FSA-Red Algorithm; 2017 International Conference on Computer and Applications (ICCA)

researchProduct

Discovering representative models in large time series databases

2004

The discovery of frequently occurring patterns in a time series could be important in several application contexts. As an example, the analysis of frequent patterns in biomedical observations could support diagnosis and/or prognosis. Moreover, the efficient discovery of frequent patterns may play an important role in several data mining tasks such as association rule discovery, clustering and classification. However, in order to identify interesting repetitions, it is necessary to allow errors in the matching patterns; in this context, it is difficult to select one pattern particularly suited to represent the set of similar ones, whereas modelling this set with a single model could…

Association rule learning; Discretization; Computer science; Context (language use); Correlation and dependence; computer.software_genre; Set (abstract data type); Cardinality; Knowledge extraction; Motif extraction; Pattern discovery; Pattern matching; Data mining; Cluster analysis; Time complexity; computer

researchProduct

Pattern Discovery In Biosequences: From Simple To Complex Patterns

2007

Bioinformatics; Pattern Discovery; String Analysis

researchProduct

Textual data compression in computational biology: Algorithmic techniques

2012

In a recent review [R. Giancarlo, D. Scaturro, F. Utro, Textual data compression in computational biology: a synopsis, Bioinformatics 25 (2009) 1575–1586] the first systematic organization and presentation of the impact of textual data compression for the analysis of biological data has been given. Its main focus was on a systematic presentation of the key areas of bioinformatics and computational biology where compression has been used together with a technical presentation of how well-known notions from information theory have been adapted to successfully work on biological data. Rather surprisingly, the use of data compression is pervasive in computational biology. Starting from…

Biological data; Data Compression Theory and Practice; Alignment-free sequence comparison; Entropy; Huffman coding; Hidden Markov Models; Kolmogorov complexity; Lempel–Ziv compressors; Minimum Description Length principle; Pattern discovery in bioinformatics; Reverse engineering of biological networks; Sequence alignment; Settore INF/01 - Informatica; General Computer Science; Computer science; Search engine indexing; Computational biology; Information theory; Information science; Theoretical Computer Science; Technical Presentation; Entropy (information theory); Data compression; Computer Science Review

researchProduct

Entropic Profiles, Maximal Motifs and the Discovery of Significant Repetitions in Genomic Sequences

2014

The degree of predictability of a sequence can be measured by its entropy and is closely related to its repetitiveness and compressibility. Entropic profiles are useful tools to study the under- and over-representation of subsequences, also providing information about the scale of each conserved DNA region. On the other hand, compact classes of repetitive motifs, such as maximal motifs, have proved useful for the identification of significant repetitions and for the compression of biological sequences. In this paper we show that there is a relationship between entropic profiles and maximal motifs, and in particular we prove that the former are a subset of the latter. As a furt…

Combinatorics; Speedup; Settore INF/01 - Informatica; Linear space; Pattern discovery; maximal motifs; Entropy (information theory); Predictability; Time complexity; Mathematics

researchProduct

Characterization and Extraction of Irredundant Tandem Motifs

2012

We address the problem of extracting pairs of subwords (m1,m2) from a text string s of length n such that, given an integer constant d as input, m1 and m2 occur in tandem within a maximum distance of d symbols in s. The main effort of this work is to eliminate the possible redundancy from the candidate set of tandem motifs so found. To this aim, we first introduce the concept of maximality, characterized by four specific conditions, which we show cannot be deduced from the corresponding notion of maximality already defined for "simple" (i.e., non-tandem) motifs. Then, we further eliminate the remaining redundancy by defining the concept of irredundancy for tandem motifs. We prove t…

Discrete mathematics; Redundancy (information theory); Tandem; Motif extraction; Pattern discovery; Text string; Linear number; Mathematics

researchProduct

2D motif basis applied to the classification of digital images

2016

The classification of raw data often involves the problem of selecting the appropriate set of features to represent the input data. Different types of features can be extracted from the input dataset, but only some of them are actually relevant for the classification process. Since relevant features are often unknown in real-world problems, many candidate features are usually introduced. This degrades both the speed and the predictive accuracy of the classifier due to the presence of redundancy in the set of candidate features. Recently, a special class of bidimensional motifs, i.e., the 2D motif basis, has been introduced in the literature. The 2D motif basis has been shown to be powerful in capturing the r…

General Computer Science; Basis (linear algebra); Contextual image classification; Computer science; business.industry; pattern discovery; image classification; motif patterns in 2D; Pattern recognition; 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; Set (abstract data type); Digital image; ComputingMethodologies_PATTERNRECOGNITION; 010201 computation theory & mathematics; 0202 electrical engineering electronic engineering information engineering; Redundancy (engineering); Benchmark (computing); 020201 artificial intelligence & image processing; Artificial intelligence; business; Classifier (UML); Image compression

researchProduct

Motif patterns in 2D

2008

Motif patterns consisting of sequences of intermixed solid and don’t-care characters have been introduced and studied in connection with pattern discovery problems of computational biology and other domains. In order to alleviate the exponential growth of such motifs, notions of maximal saturation and irredundancy have been formulated, whereby more or less compact subsets of the set of all motifs can be extracted, that are capable of expressing all others by suitable combinations. In this paper, we introduce the notion of maximal irredundant motifs in a two-dimensional array and develop initial properties and a combinatorial argument that poses a linear bound on the total number of …

General Computer Science; Pattern discovery; Theoretical Computer Science; Combinatorics; Exponential growth; Motif extraction; 2D Motifs; Motif; 2D irredundant motifs; Motif (music); Pattern matching; Remainder; Design and analysis of algorithms; Mathematics; Computer Science (all)

researchProduct

Flexible pattern discovery with (extended) disjunctive logic programming

2005

The post-genomic era has brought up a wide range of challenging new issues for the areas of knowledge discovery and intelligent information management. Among them, the discovery of complex pattern repetitions in string databases plays an important role, specifically in those contexts where even which pattern classes are to be considered interesting is unknown. This paper provides a contribution in this precise setting, proposing a novel approach, based on disjunctive logic programming extended with several advanced features, for discovering interesting pattern classes from a given data set.

Information management; Range (mathematics); Knowledge extraction; business.industry; Computer science; Logical programming; Disjunctive programming; Information system; Motif extraction; Pattern discovery; Artificial intelligence; Levenshtein distance; business; K-optimal pattern discovery

researchProduct